ABSTRACT

Various types of pattern deformations are investigated from the syntactic
point of view and categorized into two major types: local deformations and
structural deformations. Random noise, distortion variations, and
substitutions of pattern primitives belong to the former; syntactic errors due
to changes in pattern structure, such as primitive deletions and insertions,
belong to the latter. Every observed pattern can be regarded as transformed
from a pure pattern through these two types of deformations. An
error-correcting parsing scheme for local deformations, optimum in the Bayes
sense, is proposed. A corresponding recognition rule is then described, which
can be regarded as a hybrid classifier because it combines the advantages of
both the syntactic and the statistical approaches to pattern recognition. When
this scheme is applied to string and tree languages without structural
deformations, it is shown that various known structure-preserved
error-correcting parsing schemes can be considered special cases of this
general scheme. Two structure-preserved error-correcting parsers, one for
string languages and the other for tree languages, are also presented.
Finally, further research on error-correcting parsing for structural
deformations and on a complete error-correcting system for both kinds of
deformations is suggested.
1. Introduction

To recognize noisy or deformed patterns using the syntactic pattern
recognition approach, error-correcting parsing and classification techniques
using various decision criteria have been proposed [1-5,20]. Errors induced
on the primitives of noisy or deformed patterns usually are classified into
three types: substitutions, deletions, and insertions. If only substitution
errors are considered, the error-correcting parser is said to be
structure-preserved. After an input pattern is parsed by a certain pattern
grammar, the parser outputs a quantitative measure, either deterministic or
probabilistic, indicating how likely it is that the input pattern is generated
by the grammar. The decision criterion is then used to classify the input
pattern as belonging to the pattern class with an extreme quantitative
measure, either minimum or maximum, depending on how the measure is defined.
The two most widely used decision criteria are the minimum-distance and
maximum-likelihood criteria, though others have also been proposed [2,5].

Influenced by linguistic types of representation, which adopt only symbolic
notations as terminals, most of the existing error-correcting parsing methods
[1-4,20] use discrete symbols to represent structural pattern primitives.
However, it happens quite often that a primitive also contains continuous
semantic or numerical information useful for pattern discrimination [5,6,7].
In such cases these parsing methods are not appropriate, because they cannot
utilize continuous semantic or numerical information.

To take care of both structural and numerical information simultaneously, a
deformational model for pattern primitives is introduced in this report.
Based on this model, error-correcting parsing and classification techniques
using the Bayes decision rule are then proposed. Various known
error-correcting parsing schemes and classification rules are compared with
the proposed techniques. A complete illustrative example is given to show
the applicability of the proposed model and techniques.


2. Deformational Model

In this section, we give a formal description, from a broader point of view,
of the basic concepts of images, patterns, subpatterns, and primitives used
in syntactic pattern recognition, which we will call structural entities.
Based on these concepts, we propose a deformational model which will serve
later as a basis for developing a Bayes error-correcting recognition system.
These concepts are described as generally as possible so that they can be
applied to a variety of pattern languages, and in such a way that the
distinction between the syntactic and semantic information available from the
structural entities is emphasized. In particular, examples are given for
string and tree languages for illustrative purposes.

2.1  Basic  Concepts 

An observed image usually can be considered as deformed from a pure image.
For example, a smooth shape in a picture may become noisy after it is
digitized. Here the original shape is the pure image and its noisy version
is the observed image. When similar pure images are clustered as a pure
pattern class, there corresponds a set of observed images, each of which we
will call an observed pattern. In practical applications, grammars are often
inferred, either from pure or from observed patterns, to recognize observed
images. In some simple cases, the deformations, such as noise, existing in
observed patterns can be eliminated by intensive preprocessing such as
thresholding. But in general, they cannot be eliminated entirely. This is
why error-correcting parsing is necessary.

Before a class of patterns can be described by a pattern grammar, each
pattern is decomposed into smaller and simpler structural units called
primitives. Primitives should be chosen properly so that the resulting
descriptions of the patterns using grammars can be simple [7]. We call the
description of a pattern using some fixed primitives a structural
representation, which is, for string languages, a string (representation)
consisting of symbols each of which corresponds to a primitive, and is, for
tree languages, a tree (representation) with each of its nodes corresponding
to a primitive. Of course, pure primitives, pure patterns, and pure
structural representations also have their corresponding observed primitives,
observed patterns, and observed structural representations, respectively.

2.2  Primitives 

A detailed study of various kinds of primitives used for pattern
descriptions [7-9] reveals that each primitive may contain two kinds of
information, namely, the syntactic information and the semantic information.
The syntactic information gives a structural description of the primitive,
and the semantic information provides the meaning or numerical description
of the primitive. To be more specific, two examples are given in the
following for illustrative purposes.

I. Primitives for string languages. A primitive for string languages usually
is simply a symbol. Different symbols are used to represent different
primitives, such as an arc, a straight line segment, an angle, etc., for
describing shape boundaries. But it happens quite often that we need more
information involving numerical measurements to describe a primitive more
accurately. For example, we may want to discriminate two arc primitives by
their lengths and curvatures. Then, the syntactic information contained in
these two primitives is the arc structure, and the semantic information is
their respective lengths and curvatures. You and Fu [9] used two kinds of
primitives, curve segment primitives and angle primitives, to describe
shapes. The first one is a curve segment with 4 numerical features
describing its direction, length, curvature, and symmetry. The second one is
an angle with one feature describing the angle amplitude. These two kinds of
primitives serve as a very good example for illustrating the above concept of
primitive information.

II. Primitives for tree languages. Similarly, a primitive for tree languages
may have any kind of primitive structure and various kinds of numerical
measurements on the primitive. For example, Lu and Fu [10] used a pixel with
its gray value as a primitive to set up a tree model. Then the primitive
structure is a pixel and the semantic information is its gray value.

Now we are ready to give a formal description of a primitive. We consider a
primitive $a$, either pure or observed, as a 2-tuple

\[ a = (s, x) \]

where

$s$ is a syntactic symbol denoting the primitive structure of $a$, and

$x = [x_1, x_2, \ldots, x_m]$ is an $m$-dimensional semantic vector with each
$x_i$ ($i = 1, 2, \ldots, m$) denoting a numerical measurement or a logical
predicate, and $m \ge 0$. When $m = 0$, or no semantic information is
available, set $x = \phi$ (empty vector).
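The 2-tuple above maps naturally onto a small record type. The following
Python sketch is only an illustration: the class name `Primitive`, the field
names, and the example values are assumptions made here, not part of the
original report.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Primitive:
    """A primitive a = (s, x): a syntactic symbol plus a semantic vector."""
    s: str                     # syntactic symbol (primitive structure)
    x: Tuple[float, ...] = ()  # semantic vector; empty when m = 0

    @property
    def m(self) -> int:
        """Dimension of the semantic vector."""
        return len(self.x)


# Hypothetical examples: an arc described by length and curvature,
# and a line segment carrying no semantic information.
arc = Primitive("arc", (12.0, 0.35))
line = Primitive("line")
```

Keeping `s` and `x` as separate fields mirrors the separation of syntactic
and semantic information that the deformational model relies on.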

A similar idea was also proposed by Shaw [21] and described in Fu [7].

Two  remarks  are  in  order. 

I. Influenced by linguistic representations, the primitives used in syntactic
pattern recognition tend to be restricted to symbolic notations, which
essentially give only syntactic information. Even when a continuous type of
numerical information, such as random noise, is included in the primitives,
it is often thresholded into discrete numerical values which are then denoted
by a finite number of primitive symbols. Such an approach not only decreases
the discrimination accuracy because of the numerical thresholding but also
increases the number of grammar rules because of the increase in the number
of primitives (i.e., terminals). With a primitive described as above, such
weaknesses can be eliminated, as will be seen later.

II. Since a primitive contains two parts of information, we obtain a great
deal of flexibility in selecting primitives. This point is also emphasized
in [6]. Any structural unit can be selected as a primitive, and if more
properties are needed to specify the primitive, numerical or semantic
information can be invoked. Furthermore, with semantic information separated
from syntactic information in a primitive, a very systematic deformational
model can be developed for optimum error-correcting parsing schemes, which
will be described in the following sections.

2.3  Pattern  Structures 

To transform a pattern into a structural representation using primitives as
constructing units, we need a fixed constructing rule, which we will call a
pattern structure. For example, to convert a shape into a string
representation with arcs, line segments, and angles as primitives, we have to
know the starting primitive and the direction in which the shape boundary
should be traced. So a string structure is needed. Similarly, a tree
structure is needed to convert the set of primitives of a given pattern into
a tree representation (for example, see [5,19]). So a structural
representation of a pattern can be considered as the arrangement of
primitives according to a fixed pattern structure. Usually, in practical
applications, the number of pattern structures used by a pattern language is
finite and not too large. In some cases, there is even only one such
structure used for all structural representations [5,10]. For string
languages, strings with different lengths are of different string structures,
and for tree languages, trees with different numbers of nodes or different
connecting branches are also of different tree structures. But the number of
primitives in a structural representation is not the only factor that
discriminates pattern structures. In some cases, different implicit
relations implied by the concatenations in a string or by the branches in a
tree also define different pattern structures, although such relations may be
represented explicitly by terminals in some pattern languages such as the PDL
and PLEX languages [21,22].

Now we can say that a pattern class consists of a set of patterns, each of
which in turn can be transformed into a structural representation using a set
of prespecified primitives (and relations) according to one of some fixed
pattern structures for this pattern class. These structural representations
can then be used to infer a pattern grammar to characterize this pattern
class. So each terminal used in the grammar is just a primitive which can be
described by a 2-tuple consisting of a syntactic symbol and a semantic vector
as defined in Section 2.2.

2.4  The  Deformational  Model 

From previous discussions, it is clear that a pattern or its structural
representation $\omega$ can be fully characterized by a 2-tuple
$\omega = (S, A)$, where $A = \{a_i \mid i = 1, 2, \ldots, n\}$ is the set of
primitives used in $\omega$ and $S$ denotes the pattern structure of $\omega$
together with implicitly assumed relations among the primitives. For
convenience of discussion in the following sections, we assume that the
subscripts of the $a_i$ are numbered according to some fixed order which is
determined by the pattern structure $S$; when $S$ is fixed, this ordering is
also fixed.

Given the structural representation $\omega = (S, A)$ of a certain pure
pattern with pattern structure $S$ and primitive set

\[ A = \{ a_i \mid a_i = (s_i, x_i),\ x_i = (x_{i1}, x_{i2}, \ldots, x_{iN_i}),\ N_i \ge 0,\ i = 1, 2, \ldots, n \}, \]

the structural representation of its corresponding observed pattern
$\omega' = (S', A')$, with pattern structure $S'$ and primitive set

\[ A' = \{ a'_i \mid a'_i = (s'_i, x'_i),\ x'_i = (x'_{i1}, x'_{i2}, \ldots, x'_{iN'_i}),\ N'_i \ge 0,\ i = 1, 2, \ldots, n' \}, \]

can be considered as being transformed from $\omega$ through a series of
deformations. Our deformational model categorizes all possible deformations
into two major types: structural deformations and local deformations.

I. Local deformations. If $S = S'$, but $a_i \ne a'_i$ for some $i$,
$i = 1, 2, \ldots, n$, then we say $\omega'$ is deformed locally from
$\omega$. In other words, a local deformation induced on a pure pattern
preserves the entire pattern structure but deforms some primitives locally.
So a local deformation is also called a structure-preserved deformation.
With respect to strings, this simply means a length-preserved deformation.

II. Structural deformations. If $S \ne S'$, then we say that $\omega'$ is
deformed structurally from $\omega$. Various types of structural
deformations, such as insertions, deletions, transpositions, and permutations
[11,2,12], have been defined according to various kinds of structural
difference between $S$ and $S'$.

In this report, we deal only with local deformations, leaving structural
deformations for further investigation.

2.5 Local Deformations

A deformation induced on at least one primitive of a given pure pattern is
called a local deformation. Let $a_i = (s_i, x_i)$ be the pure primitive
deformed, where

\[ x_i = (x_{i1}, x_{i2}, \ldots, x_{iN_i}), \]

and $c_i = (t_i, z_i)$ be one of its observed versions, where

\[ z_i = (z_{i1}, z_{i2}, \ldots, z_{iN'_i}). \]

At least two types of local deformations can be identified, as follows:

I. Syntactic local deformation. This is the case when $t_i \ne s_i$. In
other words, when the primitive structure is changed to another one, a
syntactic local deformation is induced, which usually is called a
substitution error.

II. Semantic local deformation. When the local deformation on $a_i$ does not
change the primitive structure but only corrupts the semantic information,
i.e., when $t_i = s_i$ but $z_i \ne x_i$, then it is called a semantic local
deformation. If every primitive used by a pattern has an identical primitive
structure, then every local deformation is semantic.

In general, we can consider a local deformation as a two-step transformation
from $a_i = (s_i, x_i)$ to $c_i = (t_i, z_i)$ in the following way:

\[ \underbrace{(s_i, x_i)}_{\text{pure primitive } a_i} \xrightarrow{\ \text{synt. loc. def.}\ } \underbrace{(t_i, y_i)}_{\text{semi-pure primitive } b_i} \xrightarrow{\ \text{sem. loc. def.}\ } \underbrace{(t_i, z_i)}_{\text{observed primitive } c_i} \]

where $b_i = (t_i, y_i)$, called a semi-pure primitive, is created to denote
one of the syntactically local-deformed versions of $(s_i, x_i)$, with $y_i$
being a representative semantic vector for $t_i$; it is created only for
explanatory convenience and does not have much practical use later in our
derivation of parsing procedures.† When $t_i = s_i$, then $y_i = x_i$, and
only semantic local deformations happen in the two-step transformation.

Let $A = \{a_i \mid a_i = (s_i, x_i),\ i = 1, 2, \ldots, n\}$ denote all the
pure primitives used in a pure pattern. Though each $a_i$ can be deformed
syntactically into a set of semi-pure primitives
$D_{a_i} = \{b_{ij} \mid b_{ij} = (t_{ij}, y_{ij}),\ j = 1, 2, \ldots, k_i\}$,
each deformation $a_i \to b_{ij}$ may occur with a different probability. So
there exists a conditional probability function $p$ defined on $D_{a_i}$ for
each $a_i$ such that $p(b_{ij} \mid a_i) = p(t_{ij} \mid s_i)$ is the
probability for $s_i$ to be deformed into $t_{ij}$, $j = 1, 2, \ldots, k_i$.
Similarly, since each $b_{ij}$ can be deformed semantically into a set of
observed primitives
$D_{b_{ij}} = \{c_{ijk} \mid c_{ijk} = (t_{ij}, z_{ijk}),\ z_{ijk} \in R_{ij}\}$,
where $R_{ij}$ is a range for $z_{ijk}$, which may consist of a finite number
of discrete elements or of an infinite number of continuous elements, we can
define a conditional probability or density function $q$ on $D_{b_{ij}}$ such
that $q(z_{ijk} \mid b_{ij}, a_i) = q(z_{ijk} \mid t_{ij}, s_i)$ is the
probability or density for $b_{ij} = (t_{ij}, y_{ij})$ to be deformed into
$c_{ijk} = (t_{ij}, z_{ijk})$. Therefore, from a probabilistic point of view,
a local deformation from $a_i = (s_i, x_i)$ to $c_{ijk} = (t_{ij}, z_{ijk})$
can now be interpreted as follows:

\[ a_i = (s_i, x_i) \xrightarrow[\text{synt. loc. def.}]{\ p(t_{ij} \mid s_i)\ } b_{ij} = (t_{ij}, y_{ij}) \xrightarrow[\text{sem. loc. def.}]{\ q(z_{ijk} \mid t_{ij}, s_i)\ } c_{ijk} = (t_{ij}, z_{ijk}), \]

where $p(\cdot \mid s_i)$ is the conditional probability function given $a_i$
(or $s_i$) defined on $D_{a_i}$, and $q(\cdot \mid t_{ij}, s_i)$ is the
conditional probability or density function given $a_i$ and $b_{ij}$ (or
$s_i$, $t_{ij}$) defined on $D_{b_{ij}}$. We also assume that
$a_i \in D_{a_i}$ and $b_{ij} \in D_{b_{ij}}$.
To be more specific, we give two examples for the semantic local
deformations, assuming no syntactic local deformation is involved, that is,

\[ a_i = (s_i, x_i) \xrightarrow[\text{sem. loc. def.}]{\ q(z_{ij} \mid s_i)\ } c_{ij} = (s_i, z_{ij}). \]

†Sometimes, for normally distributed $z_i$, $y_i$ can be conveniently chosen
to be the mean value of $z_i$.

I. Random noise. This is the case when the semantic vector in a pure
primitive $a_i = (s_i, x_i)$ is subject to random noise corruption. So the
deformed or noisy version of $x_i$, denoted as $z_{ij}$ above, is actually a
random vector with continuous density function $q(\cdot \mid s_i)$. If the
noise associated with $x_i$ is normally distributed with zero mean, then
$x_i$ is in fact just the mean vector of $z_{ij}$, or $x_i = E[z_{ij}]$.

II. Distortion variations. In some cases, $x_i$ may be deformed into only a
finite number of observed versions $z_{ij}$. Then $q(\cdot \mid s_i)$ above
is just a discrete probability function defined on all possible $z_{ij}$.

Back to our discussion of two-step local deformations: given a pure primitive
$a_i = (s_i, x_i)$, the probability that it is deformed locally into an
observed primitive $c_{ijk} = (t_{ij}, z_{ijk})$ can now be computed as

\[ \gamma(c_{ijk} \mid a_i) = \lim_{\Delta z_{ijk} \to 0} p(t_{ij} \mid s_i)\, q(z_{ijk} \mid t_{ij}, s_i)\, \Delta z_{ijk} \]

if $q(\cdot \mid t_{ij}, s_i)$ is a continuous density function, or simply

\[ \gamma(c_{ijk} \mid a_i) = p(t_{ij} \mid s_i)\, q(z_{ijk} \mid t_{ij}, s_i) \]

if $q(\cdot \mid t_{ij}, s_i)$ is a discrete probability function. And given
a pure pattern $\omega = (S, A)$ with
$A = \{a_i \mid a_i = (s_i, x_i),\ i = 1, 2, \ldots, n\}$, the probability
that $\omega$ is deformed locally into a structure-preserved observed pattern
$\omega' = (S, C)$ with
$C = \{c_i \mid c_i = (t_i, z_i),\ a_i \xrightarrow{\text{loc. def.}} c_i,\ i = 1, 2, \ldots, n\}$
is

\[ P(\omega' \mid \omega) = \prod_{i=1}^{n} \gamma(c_i \mid a_i) = \prod_{i=1}^{n} \lim_{\Delta z_i \to 0} p(t_i \mid s_i)\, q(z_i \mid t_i, s_i)\, \Delta z_i \]

when $q(\cdot \mid t_i, s_i)$ is continuous, or

\[ P(\omega' \mid \omega) = \prod_{i=1}^{n} p(t_i \mid s_i)\, q(z_i \mid t_i, s_i) \]

when $q(\cdot \mid t_i, s_i)$ is discrete, if each $a_i$ is deformed
independently into $c_i$, $i = 1, 2, \ldots, n$. Such an independence
assumption for local deformations of primitives was also considered by
Grenander [13], Kovalevsky [14], and Fung and Fu [3].
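As an illustration of the product rule for the discrete case, the following
Python sketch sums log-probabilities over the primitives of a
structure-preserved pair of patterns. The function names, the tables `p` and
`q`, the symbols, and the semantic labels are all hypothetical values chosen
for the example; they do not come from the report.

```python
import math


def safe_log(v):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(v) if v > 0.0 else -math.inf


def local_deformation_log_prob(pure, observed, p, q):
    """Return ln P(w'|w) = sum_i [ln p(t_i|s_i) + ln q(z_i|t_i,s_i)].

    pure         : list of pure primitives (s_i, x_i)
    observed     : list of observed primitives (t_i, z_i), same length
    p[s][t]      : probability that pure symbol s is deformed into symbol t
    q[(t, s)][z] : discrete probability of the observed semantic value z
    """
    if len(pure) != len(observed):
        raise ValueError("a local deformation must preserve the structure")
    total = 0.0
    for (s, _x), (t, z) in zip(pure, observed):
        total += safe_log(p[s].get(t, 0.0))
        total += safe_log(q.get((t, s), {}).get(z, 0.0))
    return total


# Hypothetical deformation tables for two symbols and two semantic labels.
p = {"arc": {"arc": 0.9, "line": 0.1},
     "line": {"arc": 0.2, "line": 0.8}}
q = {("arc", "arc"): {"short": 0.7, "long": 0.3},
     ("line", "arc"): {"short": 0.5, "long": 0.5},
     ("arc", "line"): {"short": 0.4, "long": 0.6},
     ("line", "line"): {"short": 0.1, "long": 0.9}}

pure = [("arc", "short"), ("line", "long")]
observed = [("arc", "short"), ("arc", "long")]
log_prob = local_deformation_log_prob(pure, observed, p, q)
```

Working in logarithms avoids numerical underflow for long patterns and
matches the additive form that the Bayes distance takes in Section 3.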


3. Bayes Structure-Preserved Error-Correcting Parsers

In this section, we derive structure-preserved error-correcting parsers
(SPECP) optimum in the Bayes sense for locally deformed patterns. Given a
pattern class consisting of various pure patterns which can be generated by a
pattern grammar, we can, from a statistical point of view, consider each pure
pattern together with all its possible locally deformed versions as a
distinct subclass of the given pattern class. Then the SPECP to be derived,
which we will call Bayes SPECP, are optimum in the sense that, in addition to
possessing syntactic parsing capability, they are Bayes subclass classifiers
which assign each given observed pattern, according to the Bayes decision
rule, to the subclass whose pure pattern has the maximum probability of being
deformed into the given observed pattern.


3.1  Statistical  Considerations 

Given an observed pattern $\omega = (S, A)$ with
$A = \{a_i \mid a_i = (s_i, x_i),\ x_i = (x_{i1}, x_{i2}, \ldots, x_{iL_i}),\ i = 1, 2, \ldots, n\}$
of a certain pure pattern class $C$ which consists, for simplicity, of only
two pure patterns $\omega_1 = (S, B_1)$ and $\omega_2 = (S, B_2)$ with
$B_1 = \{b_i^1 \mid b_i^1 = (t_i^1, y_i^1),\ y_i^1 = (y_{i1}^1, y_{i2}^1, \ldots, y_{iL_i^1}^1),\ i = 1, 2, \ldots, n\}$
and
$B_2 = \{b_i^2 \mid b_i^2 = (t_i^2, y_i^2),\ y_i^2 = (y_{i1}^2, y_{i2}^2, \ldots, y_{iL_i^2}^2),\ i = 1, 2, \ldots, n\}$,
we want to assign $\omega$ to one of the two pure pattern subclasses
$\omega_1$ and $\omega_2$ according to statistical hypothesis testing theory.
Using the Bayes decision rule, we get, according to the analysis for the
deformational model in the last section under the independence assumption for
local deformations,

\[ \frac{P(\omega_1 \mid \omega)}{P(\omega_2 \mid \omega)} \ \underset{\omega_2}{\overset{\omega_1}{\gtrless}}\ 1 \]

(decide $\omega \in \omega_1$ when the ratio exceeds 1 and
$\omega \in \omega_2$ otherwise), or

\[ \frac{P(\omega \mid \omega_1) P(\omega_1)}{P(\omega \mid \omega_2) P(\omega_2)} = \prod_{i=1}^{n} \frac{\gamma(a_i \mid b_i^1)}{\gamma(a_i \mid b_i^2)} \cdot \frac{P(\omega_1)}{P(\omega_2)} = \prod_{i=1}^{n} \frac{p(s_i \mid t_i^1)\, q(x_i \mid s_i, t_i^1)}{p(s_i \mid t_i^2)\, q(x_i \mid s_i, t_i^2)} \cdot \frac{P(\omega_1)}{P(\omega_2)} \ \underset{\omega_2}{\overset{\omega_1}{\gtrless}}\ 1 \]

or, taking logarithms,

\[ \sum_{i=1}^{n} [\ln p(s_i \mid t_i^1) + \ln q(x_i \mid s_i, t_i^1)] + \ln P(\omega_1) \ \underset{\omega_2}{\overset{\omega_1}{\gtrless}}\ \sum_{i=1}^{n} [\ln p(s_i \mid t_i^2) + \ln q(x_i \mid s_i, t_i^2)] + \ln P(\omega_2), \]

where $P(\omega_1 \mid \omega)$, $P(\omega_2 \mid \omega)$, $P(\omega_1)$,
$P(\omega_2)$ are the a posteriori and a priori probabilities for pure
pattern subclasses $\omega_1$ and $\omega_2$, and $p(\cdot \mid t_i^j)$,
$q(\cdot \mid s_i, t_i^j)$, $j = 1, 2$, are as defined in the last section.
When the pure pattern class $C$ consists of more than two patterns, the above
decision rule can be extended as follows. Let $\lambda_j$ be such that

\[ -\ln \lambda_j = -\sum_{i=1}^{n} [\ln p(s_i \mid t_i^j) + \ln q(x_i \mid s_i, t_i^j)] - \ln P(\omega_j), \]

$j = 1, 2, \ldots, M$, with $M$, either finite or infinite, being the total
number of pure patterns belonging to $C$; then decide
$\omega \in \omega_k$ if $k$ is such that

\[ -\ln \lambda_k = \min_{j = 1, 2, \ldots, M} (-\ln \lambda_j). \]

We call the term $-\ln \lambda_j$ the Bayes distance $B(\omega, \omega_j)$
from $\omega$ to $\omega_j$, and the term $-\ln \lambda_k$ the minimum Bayes
distance $B(\omega, C)$ from $\omega$ to pure pattern class $C$.
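The minimum-Bayes-distance rule can be sketched in a few lines of Python for
a finite set of pure patterns. This is a minimal sketch under stated
assumptions, not the report's implementation: the semantic part of each
primitive is a single number, $q$ is taken to be a normal density centred on
the pure semantic value with a common `sigma`, all probabilities are assumed
positive, and every name and value below is hypothetical.

```python
import math


def gaussian_logpdf(x, mean, sigma):
    """Log density of a one-dimensional normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (x - mean) ** 2 / (2.0 * sigma ** 2))


def bayes_distance(observed, pure, prior, p, sigma):
    """-ln(lambda_j) = -sum_i [ln p(s_i|t_i) + ln q(x_i|s_i,t_i)] - ln P(w_j).

    observed : list of observed primitives (s_i, x_i)
    pure     : list of pure primitives (t_i, y_i) of the same length
    p[t][s]  : probability that pure symbol t is observed as symbol s
    """
    d = -math.log(prior)
    for (s, x), (t, y) in zip(observed, pure):
        d -= math.log(p[t][s]) + gaussian_logpdf(x, y, sigma)
    return d


def classify(observed, pure_patterns, priors, p, sigma=1.0):
    """Return (minimum Bayes distance, index k), or None if no pattern fits."""
    best = None
    for j, (pure, prior) in enumerate(zip(pure_patterns, priors)):
        if len(pure) != len(observed):
            continue  # only structure-preserved candidates are compared
        d = bayes_distance(observed, pure, prior, p, sigma)
        if best is None or d < best[0]:
            best = (d, j)
    return best


# Hypothetical substitution probabilities and pure patterns.
p = {"a": {"a": 0.9, "b": 0.1}, "b": {"a": 0.1, "b": 0.9}}
patterns = [[("a", 1.0), ("b", 2.0)], [("b", 1.0), ("a", 5.0)]]
priors = [0.5, 0.5]
observed = [("a", 1.1), ("b", 2.2)]
best = classify(observed, patterns, priors, p)
```

Enumerating pure patterns like this only works when the class is small; the
parsers of this section exist precisely so that the minimum can be found
without listing every pure pattern.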

With Bayes distances defined as above, the Bayes SPECP, constructed from the
pattern grammar $G_C$ for a given pure pattern class $C$, is used to search,
for a given input observed pattern $\omega$, for a pure pattern $\omega_k$
accepted by $G_C$ with a minimum Bayes distance
$B(\omega, \omega_k) = B(\omega, C)$ during the error-correcting parsing. So
our problem now is reduced to how to compute the Bayes distances
$-\ln \lambda_j$ during the parsing procedure. Since the parsing is done on
each primitive at least once, there is no problem in obtaining the first term
$\sum_{i=1}^{n} [\ln p(s_i \mid t_i^j) + \ln q(x_i \mid s_i, t_i^j)]$ in
$-\ln \lambda_j$, as will be seen later. But how to get the a priori
probability $P(\omega_j)$ for the pure pattern $\omega_j$ during the parsing
procedure is, on the contrary, not so obvious. The solution is to use a
stochastic grammar for the pattern class $C$.

3.2 Stochastic Grammars for Computing Pattern Probabilities

Stochastic grammars have been introduced to take care of noisy patterns and
also to specify the probability of occurrence for each pattern accepted by
the pattern grammars [7]. The latter property is exactly what we want for
computing pattern probabilities $P(\omega_j)$.

To be more specific, a stochastic grammar is a grammar each of whose
production rules is associated with an occurrence probability. When a
stochastic pattern grammar is used to generate the structural representation
of a given pattern, a pattern occurrence probability is also generated
simultaneously, which is the product of all probabilities associated with the
production rules used in deriving the structural representation. For
details, see Fu [7]. And for inference of production rule probabilities, see
Lee and Fu [15]. Here we only give the basic notations and definitions of
stochastic context-free grammars and stochastic tree grammars [7].
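The product rule for pattern occurrence probabilities can be illustrated with
a short Python sketch. The toy grammar `RULES`, its rule numbering, and the
function names are assumptions introduced only for this example.

```python
import math
from collections import defaultdict

# Hypothetical stochastic grammar:
# rule number -> (left-hand side, right-hand side, probability)
RULES = {
    1: ("S", ("A", "B"), 1.0),
    2: ("A", ("a",), 0.7),
    3: ("A", ("b",), 0.3),
    4: ("B", ("b",), 1.0),
}


def is_proper(rules, tol=1e-9):
    """Check that the rule probabilities sum to 1 for every left-hand side."""
    totals = defaultdict(float)
    for lhs, _rhs, prob in rules.values():
        totals[lhs] += prob
    return all(abs(total - 1.0) <= tol for total in totals.values())


def pattern_probability(rule_sequence, rules):
    """Product of the probabilities of the rules used in one derivation."""
    return math.prod(rules[k][2] for k in rule_sequence)


# Derivation S => A B => a B => a b uses rules 1, 2, 4.
prob_ab = pattern_probability([1, 2, 4], RULES)
# Derivation S => A B => b B => b b uses rules 1, 3, 4.
prob_bb = pattern_probability([1, 3, 4], RULES)
```

Because the toy grammar generates only the two strings `ab` and `bb`, their
probabilities add up to one, which is the property that lets the grammar
supply the a priori term of the Bayes distance.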

Definition 1. A stochastic context-free (string) grammar is a 4-tuple
$G_s = (V_N, V_T, P_s, S)$, where

$V_N$ is a finite set of non-terminals,

$V_T$ is a finite set of terminals,

$S$ is a start symbol.
Definition 3. A stochastic tree grammar over $\langle V_T, r \rangle$ in its
expansive form is a 4-tuple $G_t = (V_N \cup V_T, r, P, S)$, where

$V_N$, $V_T$, $S$ are the same as defined in Definition 1,

$r: V_T \to N$, the set of nonnegative integers, is a rank function denoting
the number of direct descendants of a node with a symbol in $V_T$ as its
label, and

$P$ is a set of stochastic production rules, each of which is in the form

\[ X_i \xrightarrow{\ p_{ij}\ } x\,(X_{i1}\, X_{i2} \cdots X_{ir(x)}), \]

that is, a node labelled $x$ with direct descendants
$X_{i1}, \ldots, X_{ir(x)}$, where $x \in V_T$;
$X_i, X_{i1}, \ldots, X_{ir(x)} \in V_N$; $i$, $n_i$, $p_{ij}$ are the same as
defined in Definition 1; and

\[ 0 < p_{ij} \le 1 \quad \text{and} \quad \sum_{j=1}^{n_i} p_{ij} = 1. \]

3.3  Bayes  SPECP  for  String  Languages 

We describe in the following a Bayes SPECP for context-free string languages.
Given a stochastic context-free string grammar $G_s = (V_N, V_T, P_s, S)$ for
a pure pattern class, assume that the terminal set
$V_T = \{a_i \mid a_i = (t_i, w_i),\ i = 1, 2, \ldots, l\}$ contains all
possible pure primitives used by the pure patterns. For each $a_i$,
$i = 1, 2, \ldots, l$, let $p(\cdot \mid a_i) = p(\cdot \mid t_i)$ be the
conditional probability function defined on

\[ D_{a_i} = \{ b_{ij} \mid b_{ij} = (u_{ij}, x_{ij}),\ a_i \xrightarrow{\text{syn. loc. def.}} b_{ij},\ j = 1, 2, \ldots, k_i \}, \]

and $q(\cdot \mid a_i, b_{ij}) = q(\cdot \mid t_i, u_{ij})$ be the conditional
probability or density function defined on

\[ D_{b_{ij}} = \{ c_{ijk} \mid c_{ijk} = (u_{ij}, z_{ijk}),\ b_{ij} \xrightarrow{\text{sem. loc. def.}} c_{ijk},\ z_{ijk} \in R_{ij} \}. \]

Let

\[ V_T' = \bigcup_{i=1}^{l} \Big[ \bigcup_{j=1}^{k_i} D_{b_{ij}} \Big] \]

denote all possible deformed primitives, and note that $V_T \subset V_T'$.
The algorithm for the Bayes SPECP is a modification of the
Cocke-Younger-Kasami parsing scheme [16], which essentially tries to
construct a parse table $T$ for an input observed string representation $y$,
and then parses through the table to obtain a pure string representation $x$
with a minimum Bayes distance $B(y, x)$.

The table $T$ consists of entries $t_{ij}$, $1 \le i \le n$,
$1 \le j \le n - i + 1$, where $n$ is the length of string $y$. Each
$t_{ij}$ is a set of triplets $(A, d, k)$, where $A \in V_N$ is an
intermediate nonterminal used in deriving $x$, $d \in (0, \infty)$ is part of
the Bayes distance, and $k$ specifies the production rule used with $A$ at
the left-hand side.


Algorithm  1.  Bayes  Structure-Preserved  Error-Correcting  Parser  for  String 
Languages 

Input ; A stochastic  context-free  string  grammar  Gs  = (VN,VT,Ps,S)  in  Chom- 
sky normal  form  without  f-productions,  and  an  observed  string  representation 

• * 

y € VT  , y = c^C2-.-cn,  c^  = (s^,x.),  i = 1,2,...,n. 

Output : A pure  string  representation  x accepted  by  Gg  with  a minimum  Bayes 
distance  B(y,x). 

P 

Method;  Put  all  production  rules  into  order  and  let  k:  A •*  a denote  that 
P 

A * o is  the  kth  rule  in  P . 

s 

Step  1^  Construct  t.^  for  each  i,  i = 1,2, ...,n.  Let  A « VN>  For  every 
kj : A ♦ 3^  in  Pg,  j = 1,2,...,n^,  where  a^  = (t^,Wj),  n^  is  the  number  of 
production  rules  each  with  A on  the  left-hand  side  and  a terminal  on  the 
right-hand  side,  let 


: 
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d—  = - Un  pCs^tj)  + in  q(xi|tj,si>  + In  Pj]  , 
i = 1,2,. ..,n.  Then  set 


t..  = C(A,  d , k ) |d.  = 

l 1 it  tit 


min 


j— 1,2,... ,n , 


d..  , 
i] 


A t 


V 


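Step 2. Construct $t_{ij}$, $j = 2,\ldots,n$, inductively. Assume that $t_{ij'}$ has been computed for all $i$, $1 \le i \le n$, and for all $j'$, $1 \le j' < j$. For every $k_\ell: A \xrightarrow{p_\ell} B_\ell C_\ell$, $\ell = 1,2,\ldots,n_A'$, where $n_A'$ is the number of production rules with $A$ on the left-hand side and two nonterminals on the right-hand side, if there exists some $m$, $1 \le m < j$, such that $(B_\ell, e_{\ell 1}, h_{\ell 1}) \in t_{im}$ and $(C_\ell, e_{\ell 2}, h_{\ell 2}) \in t_{i+m,\,j-m}$, let

\[ e_{i\ell} = e_{\ell 1} + e_{\ell 2} - \ln p_\ell . \]

Then set

\[ t_{ij} = \{ (A, e_{i\ell^*}, k_{\ell^*}) \mid e_{i\ell^*} = \min_{\ell = 1,2,\ldots,n_A'} e_{i\ell},\ A \in V_N \}, \]

where the minimum is taken over all rules $k_\ell$ and split points $m$ satisfying the condition above.

Step 3. Repeat Step 2 until $t_{ij}$ is computed for all $1 \le i \le n$ and $1 \le j \le n-i+1$.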

Step 4. When the entire table $T$ is completed, examine entry $t_{1n}$. If there exists a triplet $(S,d,k)$ in $t_{1n}$ for some $d$ and $k$, then set $B(y,x) = d$, and the desired pure string representation $x$ can be easily traced out from the parse table $T$, starting from the $k$th production rule. If no $(S,d,k)$ exists in $t_{1n}$, then the input observed string representation $y$ is not structure-preserved; set $B(y,x) = \infty$.
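The steps of Algorithm 1 can be sketched in code as a CYK-style table fill. The following is a minimal sketch, not the authors' implementation: the function name `bayes_specp`, the rule tuples, and the `local_cost` callback (which should return $-[\ln p(s_i \mid t_j) + \ln q(x_i \mid t_j, s_i)]$, or infinity when the substitution is impossible) are all assumptions introduced here for illustration.

```python
import math

def bayes_specp(y, term_rules, bin_rules, start, local_cost):
    """Minimum-distance CYK parse (illustrative sketch of Algorithm 1).

    y          : list of observed primitives c_1..c_n
    term_rules : list of (k, A, a, p)    for rule k: A -> a   with probability p
    bin_rules  : list of (k, A, B, C, p) for rule k: A -> B C with probability p
    local_cost : hypothetical callback giving -[ln p(s|t) + ln q(x|t,s)]
    Returns (distance, table); table[(i, j)] maps A -> (d, k, split point m).
    """
    n = len(y)
    t = {}
    # Step 1: entries for substrings of length 1.
    for i in range(1, n + 1):
        cell = {}
        for k, A, a, p in term_rules:
            d = local_cost(y[i - 1], a) - math.log(p)
            if d < cell.get(A, (math.inf,))[0]:
                cell[A] = (d, k, None)
        t[i, 1] = cell
    # Steps 2-3: entries for substrings of length j = 2..n starting at i.
    for j in range(2, n + 1):
        for i in range(1, n - j + 2):
            cell = {}
            for m in range(1, j):
                left, right = t[i, m], t[i + m, j - m]
                for k, A, B, C, p in bin_rules:
                    if B in left and C in right:
                        e = left[B][0] + right[C][0] - math.log(p)
                        if e < cell.get(A, (math.inf,))[0]:
                            cell[A] = (e, k, m)
            t[i, j] = cell
    # Step 4: look for the start symbol in t_{1n}.
    return t[1, n].get(start, (math.inf,))[0], t

# Toy grammar (hypothetical): S -> A B, A -> 'a', B -> 'b', all with p = 1.
toy_term = [(2, "A", "a", 1.0), (3, "B", "b", 1.0)]
toy_bin = [(1, "S", "A", "B", 1.0)]
toy_cost = lambda c, a: 0.0 if c == a else 1.0  # made-up substitution cost
```

With this toy grammar, `bayes_specp(list("bb"), toy_term, toy_bin, "S", toy_cost)` charges one substitution, while an input of the wrong length yields an infinite distance, which corresponds to the "not structure-preserved" case of Step 4.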

3.4  Bayes  SPECP  for  Tree  Languages 

Using the minimum-Bayes-distance criterion again, we propose a Bayes SPECP for tree languages in the following. Given a stochastic tree grammar $G_s = (V_N \cup V_T, r, P_s, S)$ over $\langle V_T, r \rangle$ in its expansive form, let $V_T'$, $p(\cdot \mid a_i)$, $q(\cdot \mid a_i, b_{ij}) = q(\cdot \mid t_i, u_{ij})$, $D_{a_i}$, and the related primitive sets be all the same as those defined in Sec. 3.3. The algorithm for the Bayes SPECP follows the concept of tree automata [17], and is a backward procedure for constructing a tree-like transition table $T$ for an input observed tree representation $\beta$. Let the tree structure (i.e., the tree domain) of $\beta$ be denoted as $D_\beta$; then corresponding to each node $b$ in $D_\beta$ is an entry $t_b$ in $T$, which consists of a set of triplets $(A,d,k)$, where $A \in V_N$ is a candidate state for node $b$, $d$ is part of the Bayes distance, and $k$ specifies the production rule used with $A$ at the left-hand side.

Algorithm 2. Bayes Structure-Preserved Error-Correcting Parser for Tree Languages

Input: A stochastic tree grammar $G_s = (V_N \cup V_T, r, P_s, S)$ over $\langle V_T, r \rangle$ in its expansive form, and an observed tree representation $\beta$ with $\beta(b) = \langle s_b, x_b \rangle$ as its observed primitive at node $b$, $\langle s_b, x_b \rangle \in V_T'$.

Output: A pure tree representation $\alpha$ accepted by $G_s$ with a minimum Bayes distance $B(\beta,\alpha)$.

Method: Let $t_{b.i}$ denote the set of triplets corresponding to the $i$th descendant of node $b$.

Step 1. For each node $b$ in $\beta$ such that $r[\beta(b)] = 0$, add to $t_b$ a triplet $(A,d,k)$ with

\[ d = -[\ln p(s_b \mid t_k) + \ln q(x_b \mid t_k, s_b) + \ln p_k] \]

if $A \xrightarrow{p_k} a_k$ with $a_k = (t_k, w_k)$ is the $k$th production rule in $P_s$.




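Step 2. For each node $b$ in $\beta$ such that $r[\beta(b)] = N \ne 0$, add to $t_b$ a triplet $(A, d_s, k)$ with

\[ d_s = -[\ln p(s_b \mid t_k) + \ln q(x_b \mid t_k, s_b) + \ln p_k] + d_1 + d_2 + \cdots + d_N \]

if $A \xrightarrow{p_k} a_k(A_1 \cdots A_N)$, i.e., the tree with root $a_k$ and direct descendants $A_1, \ldots, A_N$, with $a_k = (t_k, w_k)$, is the $k$th production rule in $P_s$ and $(A_1,d_1,k_1) \in t_{b.1}$, $(A_2,d_2,k_2) \in t_{b.2}$, ..., $(A_N,d_N,k_N) \in t_{b.N}$.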

Step 3. For any two triplets $(B_1,d_1,k_1)$, $(B_2,d_2,k_2)$ in each $t_b$, delete the former if $d_1 \ge d_2$, or the latter if $d_1 < d_2$.

Step 4. Repeat Steps 1-3 until all nodes in $\beta$ have been processed.

Step 5. Examine $t_0$, the root entry of the transition table $T$. If $(S,d,k) \in t_0$ for some $d$ and $k$, then set $B(\beta,\alpha) = d$, and the desired pure tree representation $\alpha$ can be easily traced out from $T$, starting from the $k$th production rule in $P_s$. If no $(S,d,k)$ exists in $t_0$, then the input observed tree representation $\beta$ is not structure-preserved; set $B(\beta,\alpha) = \infty$.
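The bottom-up pass of Algorithm 2 can be sketched recursively. This is only an illustrative sketch under stated assumptions: the tree is encoded as nested `(primitive, [subtrees])` pairs, rules as `(k, A, a, (A_1, ..., A_N), p)` tuples, and `local_cost` is a hypothetical callback returning $-[\ln p(s_b \mid t_k) + \ln q(x_b \mid t_k, s_b)]$. The pruning here keeps the smallest-distance triplet for each candidate state, which is one possible reading of Step 3.

```python
import math

def bayes_tree_specp(tree, rules, start, local_cost):
    """Bottom-up minimum-distance tree parse (illustrative sketch).

    tree  : (primitive, [subtrees]) nested pairs
    rules : list of (k, A, a, child_states, p); child_states is () for rank 0
    Returns the minimum distance for `start` at the root, or math.inf.
    """
    def entry(node):
        prim, kids = node
        kid_entries = [entry(c) for c in kids]      # descendants are processed first
        cell = {}
        for k, A, a, states, p in rules:
            if len(states) != len(kids):            # rank of the rule must match the node
                continue
            if any(s not in e for s, e in zip(states, kid_entries)):
                continue
            d = local_cost(prim, a) - math.log(p)   # local term (Steps 1 and 2)
            d += sum(e[s][0] for s, e in zip(states, kid_entries))
            if d < cell.get(A, (math.inf,))[0]:     # keep the best triplet per state
                cell[A] = (d, k)
        return cell

    return entry(tree).get(start, (math.inf,))[0]

# Toy expansive grammar (hypothetical): S -> r(A B), A -> x | y, B -> z.
toy_rules = [
    (1, "S", "r", ("A", "B"), 1.0),
    (2, "A", "x", (), 0.5),
    (3, "A", "y", (), 0.5),
    (4, "B", "z", (), 1.0),
]
toy_cost = lambda c, a: 0.0 if c == a else 2.0      # made-up substitution cost
```

A tree whose shape does not match any derivation, such as a root with a single descendant under this toy grammar, gets an infinite distance, mirroring the "not structure-preserved" outcome of Step 5.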


3.5 Comments on Various SPECP and Least-Square-Error Distance Criteria

Fung and Fu [3] have proposed a maximum-likelihood SPECP for string languages, but the grammars used are nonstochastic, so their SPECP is only a suboptimum one, implicitly assuming that all pattern subclasses occur with equal probability. SPECP using stochastic grammars have been proposed by Fung and Fu [18], Lu and Fu [10,20], and Thompson [2], but from the viewpoint of our deformational model, their SPECP for substitution errors take care only of syntactic local deformations, which limits their applicability to pattern classification problems where semantic information, especially when it is continuous, is contained in the pattern primitives for discrimination purposes. Of course, these SPECP can still be used to handle continuous types of semantic information by thresholding them into finite discrete cases, but obviously this will decrease the error-correcting capability of the SPECP, as mentioned previously in Sec. 2.2, and as will be shown by an example in Sec. 4.1.

Next,  SPECP  for  string  and  tree  languages  using  the  minimum-distance 
criterion  have  also  been  proposed  [1,4].  In  addition  to  being  limited  to 
syntactic  local  deformations,  these  SPECP  are  statistically  optimum  only 
under  very  special  conditions,  although  they  are  convenient  and  important  in 
practical  applications  when  deformation  probabilities  or  density  functions 
are  difficult  to  infer. 

Finally, we propose in the following a new criterion, namely, the least-square-error (LSE) distance criterion for the SPECP, which is a special case of the minimum-Bayes-distance criterion but is useful for semantic local deformations.

It sometimes happens that the observed semantic vector in a primitive is normally distributed, especially when it is corrupted with random noise. Assuming that no syntactic local deformation is involved, we want to derive the Bayes distance between a pure pattern $\omega = (S,B)$ and one of its normally deformed observed patterns, $\omega' = (S,A)$. Let $A = \{a_i \mid a_i = (s_i, x_i),\ x_i = (x_{i1}, x_{i2}, \ldots, x_{iN}),\ i = 1,2,\ldots,n\}$ and $B = \{b_i \mid b_i = (s_i, w_i),\ w_i = (w_{i1}, w_{i2}, \ldots, w_{iN}),\ i = 1,2,\ldots,n\}$, and assume the following conditions:

(1) Component random variables $x_{ij}$ of $x_i$ are all independent with mean $w_{ij}$, $j = 1,2,\ldots,N$. An example of this case occurs when every $x_{ij}$ is corrupted with random noise with zero mean.

(2) $x_{ij}$ is distributed according to the following normal density function

\[ f_{ij}(x_{ij}) = \frac{1}{\sqrt{2\pi}\,\sigma_{ij}} \exp\left[ -\frac{1}{2} \left( \frac{x_{ij} - w_{ij}}{\sigma_{ij}} \right)^{2} \right]. \]

(3) Pure pattern $\omega$ has the same probability to occur as any other, so that $P(\omega)$ is a constant for every pure pattern $\omega$.

Then we get the Bayes distance from $\omega'$ to $\omega$ as

\[
\begin{aligned}
B(\omega',\omega) &= -\ln [\, p(\omega' \mid \omega)\, P(\omega) \,] \\
&= -\sum_{i=1}^{n} [\ln p(s_i \mid s_i) + \ln q(x_i \mid s_i, s_i)] - \ln P(\omega) \\
&= -\sum_{i=1}^{n} \Big( \sum_{j=1}^{N} \ln f_{ij}(x_{ij}) \Big) - \ln P(\omega) \\
&= K + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{N} \left[ \left( \frac{x_{ij} - w_{ij}}{\sigma_{ij}} \right)^{2} + 2 \ln \sigma_{ij} \right],
\end{aligned}
\]

where $K$ is a constant and $p(s_i \mid s_i) = 1$ because no syntactic local deformation is involved. As far as discrimination is concerned, we can define the normalized square-error distance as

\[ B_1(\omega',\omega) = \sum_{i=1}^{n} \sum_{j=1}^{N} \left[ \left( \frac{x_{ij} - w_{ij}}{\sigma_{ij}} \right)^{2} + 2 \ln \sigma_{ij} \right], \]

and the (unnormalized) square-error distance as

\[ B_2(\omega',\omega) = \sum_{i=1}^{n} \sum_{j=1}^{N} (x_{ij} - w_{ij})^{2}, \]

which is valid under the further assumption that all $\sigma_{ij} = 1$. A SPECP using the normalized or unnormalized least-square-error (LSE) distance criterion is called a normalized or unnormalized LSE SPECP. These two kinds of LSE SPECP for tree languages have been used by Tsai and Fu [5] for the segmentation and recognition of textures corrupted with random noise, and the results show their applicability, with the normalized LSE SPECP performing better than its unnormalized version.
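The two square-error distances are simple enough to state in code. The sketch below is illustrative only: the function names and the list-of-lists layout (one inner list of $N$ components per primitive) are assumptions made here, not part of the original formulation.

```python
import math

def normalized_lse(x, w, sigma):
    """Sum over i, j of ((x_ij - w_ij) / sigma_ij)^2 + 2 ln sigma_ij."""
    return sum(
        ((xij - wij) / sij) ** 2 + 2.0 * math.log(sij)
        for xi, wi, si in zip(x, w, sigma)
        for xij, wij, sij in zip(xi, wi, si)
    )

def unnormalized_lse(x, w):
    """Sum over i, j of (x_ij - w_ij)^2, i.e. the case where all sigma_ij = 1."""
    return sum(
        (xij - wij) ** 2
        for xi, wi in zip(x, w)
        for xij, wij in zip(xi, wi)
    )
```

When every $\sigma_{ij}$ equals 1 the logarithmic term vanishes and the two functions return the same value, which is the sense in which the unnormalized distance is a special case of the normalized one.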
4. Bayes Error-Correcting Recognition System - Hybrid Pattern Classifier

Given $m$ pattern classes $C_1, \ldots, C_m$ of pure images and their pattern grammars $G_1, \ldots, G_m$, after a given input observed pattern $\omega$ is parsed by all the Bayes SPECP of the grammars, we get a set of minimum Bayes distances $B(\omega,C_1)$, $B(\omega,C_2)$, ..., $B(\omega,C_m)$. Actually, these distances are just the negative logarithms of the conditional probabilities or densities of $\omega$ given that $\omega \in C_i$, or

\[ p(\omega \mid C_i) = \exp[-B(\omega,C_i)], \]

$i = 1,2,\ldots,m$. Our classification problem is to assign $\omega$ to the one of these $m$ classes that has the highest probability of accepting $\omega$ as its observed pattern.

Again, we can apply the Bayes decision rule to get

\[ P(C_\ell \mid \omega) = \max_{i=1,2,\ldots,m} P(C_i \mid \omega) \ \Rightarrow\ \text{decide } \omega \in C_\ell, \]

or

\[ p(\omega \mid C_\ell) P(C_\ell) = \max_{i=1,2,\ldots,m} p(\omega \mid C_i) P(C_i) \ \Rightarrow\ \text{decide } \omega \in C_\ell, \]

where $P(C_i)$ is the a priori probability for pattern class $C_i$, $i = 1,2,\ldots,m$.
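Because $p(\omega \mid C_i) = \exp[-B(\omega,C_i)]$, maximizing $p(\omega \mid C_i) P(C_i)$ is equivalent to minimizing $B(\omega,C_i) - \ln P(C_i)$, which avoids underflow when the distances are large. The following is a minimal sketch of that log-domain form; the function name `classify` and its argument layout are assumptions made for illustration.

```python
import math

def classify(distances, priors):
    """Return (index of chosen class, list of scores B_i - ln P(C_i)).

    distances : minimum Bayes distances B(omega, C_i), math.inf if rejected
    priors    : a priori probabilities P(C_i), assumed positive
    The class with the smallest score has the largest p(omega|C_i) P(C_i).
    """
    scores = [d - math.log(p) for d, p in zip(distances, priors)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

With equal priors the decision reduces to picking the smallest distance; unequal priors shift each score by $-\ln P(C_i)$ and can change the outcome.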
We call this interclass Bayes classifier together with the intraclass Bayes SPECP a Bayes error-correcting recognition system, in contrast to the maximum-likelihood classification system set up originally by Fung and Fu [3]. Such a Bayes error-correcting recognition system has essentially also been proposed by Lu and Fu [20] and Fung and Fu [18], but, as mentioned in Section 3.5, the error-correcting capability for substitution errors of their system can only take care of syntactic local deformations. The system proposed here can be considered a generalization of theirs. Note that in the proposed system, the Bayes decision rule has been used twice, for recognition of observed pattern primitives and for classification of the entire observed pattern, and SPECP are used to perform the stochastic syntax parsing of input pattern structural representations. So the recognition system can be regarded as a hybrid pattern classifier because advantages of both syntactic and statistical pattern recognition techniques have been utilized.

Computationally, this system requires more computer time in computing the Bayes distances during parsing if both syntactic and semantic local deformations are involved, but it saves some computer time by avoiding thresholding continuous semantic information existing in the primitives.

Compared with the syntactic recognition approach using stochastic grammars only [7,15], the proposed deformational scheme can be regarded as a special type of stochastic transformational grammar, which is expected to handle complex noisy input patterns where simple stochastic grammars may not be adequate [3].

4.1  An  Illustrative  Example 

A complete example for string languages is given in this section to illustrate the applicability of the proposed Bayes error-correcting recognition system and its superiority to other error-correcting systems which handle continuous semantic information by thresholding it into finite discrete cases.

Assume that we have two pure pattern classes. One pattern class $C_1$ consists of two equilateral triangles $\omega_{11}$, $\omega_{12}$ as shown in Fig. 1(a), and the other class $C_2$ consists of two other different equilateral triangles $\omega_{21}$, $\omega_{22}$ as shown in Fig. 1(b). The primitives used, which are fixed-length line segments, are shown in Fig. 1(c).

[Fig. 1, panels (a), (b), and (c)]


Also assume the following probability values: $P(C_1) = 0.5$, $P(C_2) = 0.5$, $P(\omega_{11} \mid C_1) = 0.60$, $P(\omega_{12} \mid C_1) = 0.40$, $P(\omega_{21} \mid C_2) = 0.80$, $P(\omega_{22} \mid C_2) = 0.20$. Two stochastic pattern grammars $G_1$, $G_2$, consistent with these probabilities for $C_1$, $C_2$, respectively, are as follows:

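$G_1 = (V_{N1}, V_{T1}, P_1, S_1)$

$V_{N1} = \{S_1, A, B, C, D, A_1, B_1, C_1, D_1\}$

$V_{T1} = \{a_1, a_2, a_3\}$

$P_1$:

    (1)   $S_1 \xrightarrow{0.6} A\,D$
    (2)   $S_1 \xrightarrow{0.4} A_1 D_1$
    (3)   $D \xrightarrow{1.0} B\,C$
    (4)   $A_1 \xrightarrow{1.0} A\,A$
    (5)   $D_1 \xrightarrow{1.0} B_1 C_1$
    (6)   $B_1 \xrightarrow{1.0} B\,B$
    (7)   $C_1 \xrightarrow{1.0} C\,C$
    (8)   $A \xrightarrow{1.0} a_1$
    (9)   $B \xrightarrow{1.0} a_2$
    (10)  $C \xrightarrow{1.0} a_3$

and

$G_2 = (V_{N2}, V_{T2}, P_2, S_2)$

$V_{N2} = \{S_2, A, B, C, D, A_1, B_1, C_1, D_1\}$

$V_{T2} = \{b_1, b_2, b_3\}$

$P_2$:

    (1)   $S_2 \xrightarrow{0.8} A\,D$
    (2)   $S_2 \xrightarrow{0.2} A_1 D_1$
    (3)   $D \xrightarrow{1.0} B\,C$
    (4)   $A_1 \xrightarrow{1.0} A\,A$
    (5)   $D_1 \xrightarrow{1.0} B_1 C_1$
    (6)   $B_1 \xrightarrow{1.0} B\,B$
    (7)   $C_1 \xrightarrow{1.0} C\,C$
    (8)   $A \xrightarrow{1.0} b_1$
    (9)   $B \xrightarrow{1.0} b_2$
    (10)  $C \xrightarrow{1.0} b_3$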

To use the Bayes SPECP of Algorithm 1 for illustrative purposes, the above two grammars are inferred in their context-free forms, although simpler finite-state grammars can certainly be used. They are also in Chomsky normal form.

Now assume that each pattern $\omega_{ij}$ ($i = 1,2$, $j = 1,2$) is subject to both syntactic and semantic local deformations such that each line segment in $\omega_{ij}$ is deformed independently. The semantic local deformation is induced only on the direction of each line segment. And each line segment can be syntactically deformed into a curve segment with a fixed curvature and a fixed length but with a variable direction. So we can use the 2-tuples $(L,\theta)$ and $(C,\theta)$ to characterize the pure primitives (line segments) and the deformed primitives (curve segments), respectively, where $L$ and $C$ are syntactic symbols, and $\theta$ denotes the one-dimensional semantic vector, the direction of the primitives with respect to the x-axis. So we have all the 2-tuples for the pure primitives shown in Fig. 1(c) as

    a_1 = (L,30°)     b_1 = (L,0°)
    a_2 = (L,150°)    b_2 = (L,120°)
    a_3 = (L,270°)    b_3 = (L,240°)

And we assume that each $a_i$ ($i = 1,2,3$) can be deformed syntactically into a curve segment with probability 0.1, and that each $b_i$ ($i = 1,2,3$) can be deformed syntactically into a curve segment with probability 0.13. Furthermore, each line or curve segment is semantically deformed on its direction $\theta$ approximately with a normal distribution as shown in the following data (for notation, see Sec. 2.5):


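\[ D_{a_i} = \{ a_{i1} = a_i = (L, \theta_{a_i}),\ a_{i2} = (C, \theta_{a_i}) \}, \]

where $\theta_{a_i} = 30° + (i-1) \cdot 120°$, with $p(a_{i1} \mid a_i) = 0.9$, $p(a_{i2} \mid a_i) = 0.1$, $i = 1,2,3$.

\[ D_{b_i} = \{ b_{i1} = b_i = (L, \theta_{b_i}),\ b_{i2} = (C, \theta_{b_i}) \}, \]

where $\theta_{b_i} = (i-1) \cdot 120°$, with $p(b_{i1} \mid b_i) = 0.87$, $p(b_{i2} \mid b_i) = 0.13$, $i = 1,2,3$.

\[ D_{a_{ij}} = \{ a_{ijk} \mid a_{ijk} = (S_j, \theta_k),\ |\theta_k - \theta_{a_i}| \le 40°\,\dagger \}, \]

where $i = 1,2,3$, $j = 1,2$, $S_j = L$ when $j = 1$ and $S_j = C$ when $j = 2$, and

\[ q(a_{ijk} \mid a_{ij}, a_i) = \frac{1}{\sqrt{2\pi}\,\sigma_a} \exp\left[ -\frac{1}{2} \left( \frac{\theta_k - \theta_{a_i}}{\sigma_a} \right)^{2} \right] \]

with $\sigma_a = 8°$, $\theta_{a_i} = 30° + (i-1) \cdot 120°$.

\[ D_{b_{ij}} = \{ b_{ijk} \mid b_{ijk} = (S_j, \theta_k),\ |\theta_k - \theta_{b_i}| \le 40°\,\dagger \}, \]

where $i = 1,2,3$, $j = 1,2$, $S_j = L$ when $j = 1$ and $S_j = C$ when $j = 2$, and

\[ q(b_{ijk} \mid b_{ij}, b_i) = \frac{1}{\sqrt{2\pi}\,\sigma_b} \exp\left[ -\frac{1}{2} \left( \frac{\theta_k - \theta_{b_i}}{\sigma_b} \right)^{2} \right] \]

with $\sigma_b = 10°$, $\theta_{b_i} = (i-1) \cdot 120°$.

†Mathematically, there is no limitation on the value of $\theta_k$, but for computational convenience, let's assume so.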


The 6 semi-pure primitives, i.e., the 6 curve segments corresponding to $a_{12}$, $a_{22}$, $a_{32}$, and $b_{12}$, $b_{22}$, $b_{32}$, are shown in Fig. 2(a). Two possible observed patterns deformed from the pure patterns are shown in Fig. 2(b) and Fig. 2(c), respectively.


Now suppose we want to classify the deformed pattern $\omega'$ shown in Fig. 2(c) with the following string representation:

\[ \omega' = c_1 c_2 c_3 c_4 c_5 c_6 \]

where

    c_1 = (L,15°),     c_4 = (L,135°),
    c_2 = (C,15°),     c_5 = (L,255°),
    c_3 = (C,135°),    c_6 = (C,255°).

At first, we apply the Bayes SPECP for grammars $G_1$ and $G_2$ to $\omega'$ respectively, by using the algorithm proposed in Sec. 3.3. When finished, we get the following two parse tables $T_1$, $T_2$ for $G_1$ and $G_2$, respectively. Since $S_1$ is in $t_{16}$ of $T_1$ and $S_2$ is in $t_{16}$ of $T_2$, $\omega'$ is accepted by classes $C_1$ and $C_2$ with minimum Bayes distances $d_1 = 36.68$ and $d_2 = 34.19$, respectively.

Parse Table $T_1$ (entries not listed are empty, $\phi$):

    j = 6:  t_16 = (S_1, 36.68, 2)
    j = 4:  t_34 = (D_1, 23.84, 5)
    j = 2:  t_12 = (A_1, 11.92, 4)   t_32 = (B_1, 11.92, 6)   t_52 = (C_1, 11.92, 7)
    j = 1:  t_11 = (A, 4.86, 8)      t_21 = (A, 7.06, 8)      t_31 = (B, 7.06, 9)
            t_41 = (B, 4.86, 9)      t_51 = (C, 4.86, 10)     t_61 = (C, 7.06, 10)

Parse Table $T_2$ (entries not listed are empty, $\phi$):

    j = 6:  t_16 = (S_2, 34.19, 2)
    j = 4:  t_34 = (D_1, 21.72, 5)
    j = 2:  t_12 = (A_1, 10.86, 4)   t_32 = (B_1, 10.86, 6)   t_52 = (C_1, 10.86, 7)
    j = 1:  t_11 = (A, 4.48, 8)      t_21 = (A, 6.38, 8)      t_31 = (B, 6.38, 9)
            t_41 = (B, 4.48, 9)      t_51 = (C, 4.48, 10)     t_61 = (C, 6.38, 10)


Next, we apply the interclass Bayes decision rule to get

\[ P(C_1 \mid \omega') \propto p(\omega' \mid C_1) P(C_1) = \exp(-36.68) \cdot 0.5 = 5.88 \times 10^{-17}, \]

\[ P(C_2 \mid \omega') \propto p(\omega' \mid C_2) P(C_2) = \exp(-34.19) \cdot 0.5 = 70.87 \times 10^{-17}. \]

So we decide that $\omega'$ belongs to $C_2$. This completes our illustrative example for the proposed Bayes error-correcting recognition system.
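The distances used in this decision can be checked numerically from the deformation model. The sketch below is an illustration under stated assumptions, not a run of the parser: it takes the derivation by rule (2) of each grammar as given, so that the six observed primitives are matched in order against the pure directions of $\omega_{12}$ and $\omega_{22}$, and it sums the local terms $-[\ln p + \ln q]$ plus $-\ln$ of the rule probability. The helper names are hypothetical.

```python
import math

def local_distance(p_syn, sigma, theta_obs, theta_pure):
    """-[ln p(s|t) + ln q(x|t,s)] with a normal density on the direction."""
    ln_q = (-math.log(math.sqrt(2.0 * math.pi) * sigma)
            - 0.5 * ((theta_obs - theta_pure) / sigma) ** 2)
    return -(math.log(p_syn) + ln_q)

# Observed string c_1 .. c_6 as (syntactic symbol, direction in degrees).
observed = [("L", 15), ("C", 15), ("C", 135), ("L", 135), ("L", 255), ("C", 255)]

def total_distance(pure_dirs, p_line, sigma, p_rule):
    """Sum of local distances plus -ln of the probability of start rule (2)."""
    p = {"L": p_line, "C": 1.0 - p_line}
    d = sum(local_distance(p[s], sigma, th, pd)
            for (s, th), pd in zip(observed, pure_dirs))
    return d - math.log(p_rule)

d1 = total_distance([30, 30, 150, 150, 270, 270], 0.90, 8.0, 0.4)   # class C_1
d2 = total_distance([0, 0, 120, 120, 240, 240], 0.87, 10.0, 0.2)    # class C_2
post1 = math.exp(-d1) * 0.5
post2 = math.exp(-d2) * 0.5
```

Under these assumptions `d1` agrees with the tabulated 36.68 and `d2` comes out slightly above the tabulated 34.19; the small gap appears to come from the intermediate values in Parse Table $T_2$ being cut to two decimals. The ordering `d2 < d1`, and hence the decision for $C_2$, is unaffected.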

In the following, we threshold the continuous $\theta$ values into intervals, as is usually done by other error-correcting schemes, and show how a contrary decision can be made for the previous input pattern $\omega'$. Since the proposed Bayes recognition system always gives the optimum decision in the Bayes sense, we thus have shown its better performance than other systems using thresholding approaches on continuous semantic information.

If we threshold $\theta$ values starting from 0°† in steps of 20° for class $C_1$, and from 30°† in steps of 20° for $C_2$, then $D_{a_{ij}}$ and $D_{b_{ij}}$ can be changed to the following:

\[ D_{a_{ij}} = \{ a_{ijK} \mid K = 1,2,3,4,\ a_{ijK} = (S_j, \theta_K),\ (K-2) \cdot 20° < \theta_K - \theta_{a_i} \le (K-1) \cdot 20° \} \]

with discrete probabilities

\[ q(a_{ijK} \mid a_{ij}, a_i) = \begin{cases} 0.01, & K = 1,4 \\ 0.49, & K = 2,3, \end{cases} \]

\[ D_{b_{ij}} = \{ b_{ijK} \mid K = 1,2,3,4,\ b_{ijK} = (S_j, \theta_K),\ (K-2) \cdot 20° < \theta_K - \theta_{b_i} \le (K-1) \cdot 20° \} \]

with discrete probabilities

\[ q(b_{ijK} \mid b_{ij}, b_i) = \begin{cases} 0.02, & K = 1,4 \\ 0.48, & K = 2,3, \end{cases} \]

with $S_j$ the same as defined previously. And by convention, only the following probability values are used in parsing [3]:

\[ f(a_{ijK} \mid a_i) = q(a_{ijK} \mid a_{ij}, a_i)\, p(a_{ij} \mid a_i) = \begin{cases} 0.009, & j = 1,\ K = 1,4 \\ 0.441, & j = 1,\ K = 2,3 \\ 0.001, & j = 2,\ K = 1,4 \\ 0.049, & j = 2,\ K = 2,3 \end{cases} \]

\[ f(b_{ijK} \mid b_i) = q(b_{ijK} \mid b_{ij}, b_i)\, p(b_{ij} \mid b_i) = \begin{cases} 0.0174, & j = 1,\ K = 1,4 \\ 0.4176, & j = 1,\ K = 2,3 \\ 0.0026, & j = 2,\ K = 1,4 \\ 0.0624, & j = 2,\ K = 2,3 \end{cases} \]

$i = 1,2,3$. The previous data show that each $a_i$ or $b_i$ can be deformed into 8 different observed primitives with different probabilities, of which four are line segments and the other four are curve segments.

†Starting from different points to threshold is just for convenience, because the directions for $a_1$, $b_1$ are 0° and 30°.

Now again use the Bayes SPECP proposed in Sec. 3.3 for $G_1$, $G_2$ to parse $\omega'$, respectively. Note that after thresholding the $\theta$ values in $\omega'$ and transforming into string representations, we get

\[ \omega' = a_{113}\, a_{123}\, a_{223}\, a_{213}\, a_{313}\, a_{323} \]

for class $C_1$, or

\[ \omega' = b_{112}\, b_{122}\, b_{222}\, b_{212}\, b_{312}\, b_{322} \]

for class $C_2$. Also note that the term $[\ln p(s_i \mid t_j) + \ln q(x_i \mid t_j, s_i)]$ in Algorithm 1 should be replaced by $\ln f(c_i \mid a_j)$ before the algorithm is applied to our discrete case here, where $c_i = a_{ijK}$ or $b_{ijK}$ now.


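Parse Table $T_1$ (entries not listed are empty, $\phi$):

    j = 6:  t_16 = (S_1, 12.44, 2)
    j = 4:  t_34 = (D_1, 7.68, 5)
    j = 2:  t_12 = (A_1, 3.84, 4)    t_32 = (B_1, 3.84, 6)    t_52 = (C_1, 3.84, 7)
    j = 1:  t_11 = (A, 0.82, 8)      t_21 = (A, 3.02, 8)      t_31 = (B, 3.02, 9)
            t_41 = (B, 0.82, 9)      t_51 = (C, 0.82, 10)     t_61 = (C, 3.02, 10)

Parse Table $T_2$ (entries not listed are empty, $\phi$):

    j = 6:  t_16 = (S_2, 12.53, 2)
    j = 4:  t_34 = (D_1, 7.28, 5)
    j = 2:  t_12 = (A_1, 3.64, 4)    t_32 = (B_1, 3.64, 6)    t_52 = (C_1, 3.64, 7)
    j = 1:  t_11 = (A, 0.87, 8)      t_21 = (A, 2.77, 8)      t_31 = (B, 2.77, 9)
            t_41 = (B, 0.87, 9)      t_51 = (C, 0.87, 10)     t_61 = (C, 2.77, 10)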


From the above tables, we get

\[ P(C_1 \mid \omega') \propto \exp(-12.44) \cdot 0.5 = 1.98 \times 10^{-6}, \]

\[ P(C_2 \mid \omega') \propto \exp(-12.53) \cdot 0.5 = 1.81 \times 10^{-6}. \]

So we decide $\omega'$ belongs to $C_1$!

A careful study reveals that such a contrary conclusion to the previous Bayesian decision $\omega' \in C_2$ is due to the rough thresholding used. Using smaller intervals in thresholding will improve the result, but it will never be better than our proposed system, which has minimum probability of error for recognition of primitives due to the use of the Bayes rule in the error-correcting parser.
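The reversal caused by thresholding can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and makes explicit assumptions: every observed primitive falls in a high-probability bin ($K = 2$ or $3$), so the only quantities that matter are the products $f = q \cdot p$ for line and curve segments, and the derivation again uses rule (2) of each grammar. The variable and function names are hypothetical.

```python
import math

# f = q * p for the high-probability bins of the thresholded model.
f_a = {"L": 0.49 * 0.90, "C": 0.49 * 0.10}   # class C_1: 0.441, 0.049
f_b = {"L": 0.48 * 0.87, "C": 0.48 * 0.13}   # class C_2: 0.4176, 0.0624

# Syntactic symbols of the observed string c_1 .. c_6.
symbols = ["L", "C", "C", "L", "L", "C"]

def thresholded_distance(f, p_rule):
    """-sum ln f(c_i | a_j) - ln p_rule for a derivation through start rule (2)."""
    return -sum(math.log(f[s]) for s in symbols) - math.log(p_rule)

t1 = thresholded_distance(f_a, 0.4)   # class C_1
t2 = thresholded_distance(f_b, 0.2)   # class C_2
decision = "C1" if t1 < t2 else "C2"
```

Computed without intermediate rounding, `t1` and `t2` differ from the tabulated 12.44 and 12.53 by about 0.02 each, but `t1 < t2` still holds, so the thresholded model picks $C_1$. Once the direction is reduced to a bin index, the only remaining difference between the classes is the bin probability and the syntactic deformation probability, and the information that favoured $C_2$ in the continuous case (the wider $\sigma_b$) is lost.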


5. Concluding Remarks

Bayes error-correcting recognition systems using Bayes error-correcting parsers and the Bayes interclass decision rule have been proposed both by Fung and Fu [18] and by Lu and Fu [20]. The system proposed in this report can be considered, from the viewpoint of local deformations, as a generalization of theirs in the aspect of semantic information, which is more relevant for practical pattern classification where both structural and numerical information is available for primitive discrimination, as emphasized by several investigators [13,19,6]. Further investigations should be directed to include error-correcting capability for structural deformations under the formalism of the proposed deformational model and thus set up a more complete error-correcting recognition system for more practical applications.

